Candidate genes for infertility: an in-silico study based on cytogenetic analysis

Background: The cause of infertility remains unclear in a significant proportion of reproductive-age couples who fail to conceive naturally. Chromosomal aberrations have been identified as one of the main genetic causes of male and female infertility. Structural chromosomal aberrations may disrupt the functioning of various genes, some of which may be important for fertility. The present study aims to identify candidate genes and putative functional interaction networks involved in male and female infertility using cytogenetic data from cultured peripheral blood lymphocytes of infertile patients.
Methods: Karyotypic analysis was performed in 201 infertile patients (100 males and 101 females) and 201 age- and gender-matched healthy controls (100 males and 101 females) after 72 h of peripheral lymphocyte culture and GTG banding, followed by bioinformatic analysis using Cytoscape v3.8.2 and Metascape.
Results: Several chromosomal regions with a significantly higher frequency of structural aberrations were identified in the infertile males (5q2, 10q2, and 17q2) and females (6q2, 16q2, and Xq2). Segregation of the patients by type of infertility (primary vs. secondary) showed that regions with a significantly higher frequency of structural aberrations occurred exclusively in the infertile males (5q2, 17q2) and females (16q2) with primary infertility. Cytoscape identified two networks specific to these regions: a male-specific network with 99 genes and a female-specific network with 109 genes. The top enriched GO terms within the male and female infertility networks were "skeletal system morphogenesis" and "mRNA transport", respectively. PSME3, PSMD3, and CDC27 were the top 3 hub genes identified within the male infertility network. Similarly, UPF3B, IRF8, and PSMB1 were the top 3 hub genes identified within the female infertility network. Among the hub genes identified in the male- and female-specific networks, PSMB1, PSMD3, and PSME3 are functional components of the proteasome complex. Only a limited number of reports describe the roles of these hub genes in the maintenance of fertility in mouse models and humans, and they require validation in further studies.
Conclusion: The candidate genes predicted in the present study can serve as targets for future research on infertility.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12920-022-01320-x.

single gene disorders, and multifactorial causes [2]. Among the genetic factors, chromosomal anomalies have been identified as one of the main causes of male infertility [4]. With at least 2000 genes believed to be involved in spermatogenesis, the number of genetic anomalies associated with male infertility is growing steadily [5].
On the other hand, female infertility can be caused by developmental, endocrine, immunological, metabolic, microbial, surgical or genetic factors [6,7]. The genetic causes of female infertility include chromosomal aberrations due to meiotic non-disjunction errors, copy number variants (CNVs), single gene disorders and polygenic disorders [7].
Cytogenetic aberrations are included among the main genetic causes of infertility [4,8]. Therefore, identification of chromosomal loci frequently involved in aberrations can help in identifying the genes/pathways involved in infertility using in-silico tools. Keeping these facts in mind, the present study uses a combination of cytogenetics and bioinformatic tools for the prediction of candidate genes which are actively involved in the pathogenesis of infertility.

Cytogenetic analysis
To study the cytogenetic aberrations associated with infertility, 201 infertile patients (100 males and 101 females) and 201 age- and gender-matched controls (100 males and 101 females) from a North-Indian population of Punjab, India were karyotyped after 72 h of peripheral blood lymphocyte culture and GTG-banding. The inclusion and exclusion criteria for recruitment of the infertile patients are given in Additional file 1: Table S1. The phenotypic presentations of the infertile males and females are given in Additional file 1: Table S2. The patients included in the present study were clinically diagnosed as infertile after failing to conceive naturally or with medication, and after in-vitro fertilization (IVF) failure. They include a subset of patients in whom the fertility assessment parameters (spermiogram in males, hormonal profiles and reproductive imaging in females) were within the standard clinical limits in both the male and female partners undergoing IVF (Additional file 1: Table S2).
The cytogenetic analysis of the cultured peripheral blood lymphocytes of the patients and controls involved scoring 50 to 100 metaphases per subject for chromosomal aberrations, recorded as total metaphases showing any chromosomal aberration (TAM), metaphases showing only numerical aberrations (TMNA), metaphases showing only structural aberrations (TMSA), and metaphases showing both structural and numerical aberrations (TM(NA + SA)). The frequency of chromosomal aberrations in cases and controls was compared using Student's t-test. The cut-off p-value adopted for statistical significance was 0.05.
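The case-control comparison described above can be sketched as follows. This is a minimal illustration of a two-sample Student's t-test with pooled variance, not the authors' actual analysis script; the function names and the per-subject frequencies in the example are hypothetical, and the p-value is obtained by numerically integrating the t density so that only the Python standard library is needed.

```python
import math
from statistics import mean, variance


def pooled_t_statistic(cases, controls):
    """Return (t, df) for Student's two-sample t-test with pooled variance."""
    n1, n2 = len(cases), len(controls)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(cases) + (n2 - 1) * variance(controls)) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(cases) - mean(controls)) / standard_error, df


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=20000):
    """Two-sided p-value, using Simpson's rule on the density over [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3  # P(0 <= T <= |t|)
    return max(0.0, 1.0 - 2.0 * central_half)


# Hypothetical per-subject aberration frequencies (%), for illustration only.
cases = [12.0, 15.0, 9.0, 14.0, 11.0]
controls = [4.0, 6.0, 5.0, 3.0, 7.0]
t_value, dof = pooled_t_statistic(cases, controls)
p_value = two_sided_p(t_value, dof)
print(f"t = {t_value:.3f}, df = {dof}, significant at 0.05: {p_value < 0.05}")
```

In practice a statistics package would normally be used instead; the explicit version only makes the pooled-variance formula and the 0.05 decision rule visible.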

Bioinformatic analysis
The cytogenetic analysis helped in the identification of several chromosomal regions with a significantly higher frequency of structural aberrations among the infertile patients as compared to controls. The genes harbored within these loci were assessed by in-silico tools to predict candidate genes and pathways which might be impaired in infertile males and females. The cytogenetic loci observed within these regions were used as the input query for the National Center for Biotechnology Information (NCBI) Gene database (https://www.ncbi.nlm.nih.gov/gene) to identify the constituent genes. The data provided by NCBI Gene were filtered according to species ("Homo sapiens"), chromosome number (chromosomal regions not queried were removed) and number of exons (only genes containing one or more exons were included).
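The three filtering criteria above can be expressed as a short sketch. The record layout (`GeneRecord` and its field names), the prefix-matching rule for cytogenetic bands, and the placeholder gene symbols are assumptions made for illustration; they do not reproduce the actual NCBI Gene export format or the authors' procedure.

```python
from dataclasses import dataclass


@dataclass
class GeneRecord:
    # Hypothetical record layout; not the real NCBI Gene export schema.
    symbol: str
    organism: str
    map_location: str  # cytogenetic band, e.g. "17q21.31"
    exon_count: int


def filter_genes(records, queried_regions):
    """Keep human genes located in a queried region and having >= 1 exon."""
    kept = []
    for rec in records:
        if rec.organism != "Homo sapiens":
            continue  # species filter
        if not any(rec.map_location.startswith(r) for r in queried_regions):
            continue  # drop genes outside the queried chromosomal regions
        if rec.exon_count < 1:
            continue  # exon filter
        kept.append(rec)
    return kept


# Placeholder records for illustration only.
records = [
    GeneRecord("GENE_A", "Homo sapiens", "17q21.31", 11),
    GeneRecord("GENE_B", "Mus musculus", "17q21.31", 9),
    GeneRecord("GENE_C", "Homo sapiens", "15q21.1", 4),
    GeneRecord("GENE_D", "Homo sapiens", "5q23.2", 0),
]
male_regions = ["5q2", "10q2", "17q2"]
print([rec.symbol for rec in filter_genes(records, male_regions)])
```

Matching on the band prefix keeps "17q21.31" under "17q2" while rejecting "15q21.1" for "5q2", which is why the comparison is anchored at the start of the location string.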
Cytoscape v3.8.2 [9] was used to generate various interactive biological networks from the genes annotated to the different chromosomal regions. In the present study, the 'Gene Set/Mutation Analysis' tool of the 'Reactome Functional Interaction (FI)' Cytoscape plugin [10] was used to generate the different interaction networks. For this purpose, the 2019 'Reactome FI Network' dataset and 'Show genes not linked to others' options were used to create interaction networks without the addition of any linker gene. The cytoHubba plugin [11] within Cytoscape was used to identify the various hub genes within the male and female networks. The set of genes located within the different networks was used as the input for the web-based tool, Metascape [12], to identify the genes and pathways enriched within the infertile males and females.
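The idea of a hub gene can be illustrated with a small sketch that ranks nodes by the number of distinct interaction partners (degree). cytoHubba offers several ranking methods, and the text does not state which one was applied, so degree ranking is an assumption made here for simplicity; the edge list and node names are hypothetical.

```python
from collections import defaultdict


def rank_hubs(edges, top_n=3):
    """Rank genes by degree (number of distinct partners), highest first."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbours[a].add(b)
        neighbours[b].add(a)
    # Ties are broken alphabetically so the ranking is deterministic.
    ranked = sorted(neighbours, key=lambda gene: (-len(neighbours[gene]), gene))
    return [(gene, len(neighbours[gene])) for gene in ranked[:top_n]]


# Hypothetical functional-interaction edges, for illustration only.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]
print(rank_hubs(edges))
```

Other centrality measures can rank the same network differently, so the hub list depends on the method chosen.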

Cytogenetic analysis
The comparison of chromosomal aberrations between the infertile cases and age-matched controls revealed a significantly higher mean frequency of aberrations among the infertile cases (Table 1). A similar trend was observed upon segregating the cases and controls by gender and type of infertility (primary versus secondary infertility) (Tables 2, 3). Among the infertile patients, 5 males and 8 females were identified as carriers of constitutional anomalies. These patients were removed from further analysis resulting in 188 infertile patients (95 males and 93 females) and 188 age-matched controls (95 males and 93 females) remaining for further analysis. A significantly high mean frequency of structural aberrations (deletions, chromatid/chromosomal breaks and gaps) was identified in certain chromosomal regions within these subsets of patients (Table 4). The representative karyotypes for a subset of infertile patients and healthy controls are depicted in Additional file 1: Table S3.
Upon segregating the patients based on type of infertility (primary vs. secondary infertility), chromosomal regions with a significantly high mean frequency of structural aberrations were identified only within the primary infertility patients (Table 5). The cytogenetic loci affected within these regions (Tables 4 and 5) were subjected to bioinformatic analysis.

Bioinformatic analysis
NCBI Gene returned a list of the genes present at the cytogenetic loci queried: 731 genes in the male dataset and 901 genes in the female dataset. Querying Reactome FI with the aforementioned gene sets led to the generation of a network of 99 genes in the male-specific network (Fig. 1) and 109 genes in the female-specific network (Fig. 2). Further analysis by cytoHubba led to the identification of hub genes within the male (PSMD3, PSME3, and CDC27) and female (UPF3B, IRF8, and PSMB1) networks. Metascape identified "skeletal system morphogenesis" as the top enriched term within the male infertility network (Fig. 3) and "mRNA transport" as the top enriched term within the female infertility network (Fig. 4).
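Term enrichment of the kind reported above is commonly assessed with a cumulative hypergeometric test. The sketch below assumes that test and shows how a p-value for a single term could be computed; it is not a description of Metascape's internal pipeline, and the function name and the counts in the example are hypothetical.

```python
from math import comb


def enrichment_p(background, term_size, query_size, overlap):
    """P(X >= overlap) when query_size genes are drawn from a background
    in which term_size genes carry the annotation (hypergeometric tail)."""
    max_overlap = min(term_size, query_size)
    total = comb(background, query_size)
    tail = sum(
        comb(term_size, k) * comb(background - term_size, query_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts: 10 background genes, 4 annotated to the term,
# 3 genes in the query network, 2 of which carry the annotation.
print(enrichment_p(background=10, term_size=4, query_size=3, overlap=2))
```

With many terms tested at once, the resulting p-values would normally also be corrected for multiple testing before terms are ranked.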

Discussion
The human interactome is a highly complex network of functionally interacting cellular components, including a multitude of genes, proteins, metabolites, and RNA molecules [13]. It is now believed that many diseases manifest as a result of disruption of biological cascades due to altered interaction of various network components [14]. Among the structural aberrations identified in the present study, deletions, chromatid/chromosomal gaps and breaks were the most frequent in the infertile patients. Terminal deletions were observed in the 6q, 16q, and Xq regions in infertile females. Deletions, either terminal or interstitial, result in loss of chromosomal segments and a subsequent haploinsufficiency of the gene(s) located in the deleted segments [15]. Besides deletions, chromatid/chromosomal gaps and breaks were observed in both males (5q2, 10q2, and 17q2) and females (6q2 and Xq2). These aberrations occur as a consequence of DNA damage through exposure to physical and/or chemical agents, or as a result of recombination events [16]. If left unrepaired, chromosomal breaks can result in deletions (small- or large-scale) and translocations [17].
The top enriched category within the female infertility network was GO:0051028, "mRNA transport" (Fig. 4). Within the female infertility network, sixty-eight genes had published literature on roles in female infertility (Additional file 1: Table S5).
Table 5 List of chromosomal regions with a significantly higher frequency of structural aberrations in males and females diagnosed with primary infertility
In the male infertility network, the top 3 hub genes identified were PSME3, PSMD3, and CDC27. Research on mouse models has shown that double knockout of Psme3 and Psme4 results in complete infertility in males [18]. In an additional report, male mice with PSME3 (also known as REGγ) deficiency exhibited subfertility due to a decrease in the activity and concentration of spermatozoa [19]. A comparison of gene expression profiles of high-motility sperm samples between healthy normozoospermic and asthenozoospermic individuals showed that PSMD3, CDC27 and many other proteins involved in protein polyubiquitination were significantly downregulated in asthenozoospermic individuals [20]. The ubiquitin-proteasome system (UPS) has been reported to play an important role in sperm capacitation and fertilization [21]. Therefore, the UPS components involved in the sperm proteasome can be considered as potential candidates for further research on male infertility.
In the female infertility network, the top 3 hub genes identified were UPF3B, IRF8 and PSMB1. Copy number variation in the 6q27 region (which includes PSMB1) has been speculated to be the cause of premature ovarian failure (POF) in a patient from a POF cohort [22]. In recent publications, IRF8-positive cells were reported to be increased during the proliferative phase of the menstrual cycle in the endometrium of women with endometriosis [23]. Additionally, IRF8 and MEF2C have been reported to be regulated at both the mRNA and protein level in the endometrial epithelium during the window of implantation [24]. Upf3b was predicted to be a target gene for the rno-miR-141-5p microRNA. This miRNA was reported to possibly play a role in modulating endometrial receptivity in rats with endometriosis [25]. Currently, limited reports are available on the roles of these genes in maintenance of female fertility, warranting further research on these candidates.
Fig. 1 Biological interaction network generated using Cytoscape v3.8.2 for the male infertility dataset
Analysis of the predicted loss-of-function (pLOF) variants in the Genome Aggregation Database (gnomAD) browser [26] suggests that the hub genes CDC27, PSMD3 and PSME3 (male-specific network) and PSMB1 and UPF3B (female-specific network) are intolerant to loss-of-function variants. In the clinical setting, microarray-based comparative genomic hybridization (aCGH) coupled with multiplex ligation-dependent probe amplification (MLPA) would be a better alternative to identify genomic imbalances within infertile patients having structural aberrations (especially deletions) within the chromosomal regions harboring these genes [27].
There are a few limitations associated with the present study. A cytogenetic approach was used to identify possible candidate genes located in chromosomal regions with a high mean frequency of structural aberrations in infertile patients, compared to healthy control individuals. GTG banding was used for the cytogenetic analysis. Compared to other microscopy-based alternatives, G-banding has a lower resolution [28]. Finally, there are no expression-based data for the present dataset that could reveal the differentially expressed genes associated with the infertility subsets.

Conclusion
The present study has identified several candidate genes associated with male and female infertility based on aberrations observed by chromosomal analysis of G-banded cultured peripheral blood lymphocytes. Among the hub genes, PSMB1 (female-specific network), PSMD3, and PSME3 (male-specific network) are components of the proteasome complex. Currently, limited research has been conducted in human infertility on the roles of most of the genes predicted in the present study, with a majority of the available reports limited to murine models. Therefore, future research may focus on determining the role of these genes in the maintenance of human fertility.

Additional file 1:
Clinical characteristics and karyotypes of study participants with supporting literature.